Efficient distributed discovery of bidirectional order dependencies
Authors
Abstract
Bidirectional order dependencies (bODs) capture order relationships between lists of attributes in a relational table. They can express that, for example, sorting books by publication date in ascending order also sorts them by age in descending order. Knowledge of such order relationships is useful for many data management tasks, such as query optimization, data cleaning, or consistency checking. Because the bODs of a specific dataset are usually not explicitly given, they need to be discovered. The discovery of all minimal bODs (in set-based canonical form) is a task with exponential complexity in the number of attributes, though, which is why existing bOD discovery algorithms cannot process datasets of practically relevant size in a reasonable time. In this paper, we propose the distributed bOD discovery algorithm DISTOD, whose execution time scales with the available hardware. DISTOD is a scalable, robust, and elastic bOD discovery approach that combines efficient pruning techniques for bOD candidates in set-based canonical form with a novel, reactive, and distributed search strategy. Our evaluation on various datasets shows that DISTOD outperforms both single-threaded and distributed state-of-the-art bOD discovery algorithms by up to orders of magnitude; it can, in particular, process much larger datasets.
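To make the book example concrete, the following minimal sketch checks whether a single bOD holds on a small table. It assumes the usual list-based reading of an order dependency (whenever row s precedes or ties with row t under the left-hand ordering, it must also do so under the right-hand ordering) and uses a naive pairwise comparison. It is not the DISTOD algorithm, which works on candidates in set-based canonical form; the names `compare` and `bod_holds` and the sample data are hypothetical.

```python
from itertools import permutations


def compare(s, t, spec):
    """Lexicographically compare rows s and t under an ordering spec.

    spec is a list of (attribute, direction) pairs, where direction is
    "asc" or "desc". Returns -1, 0, or 1.
    """
    for attr, direction in spec:
        if s[attr] == t[attr]:
            continue  # tie on this attribute, look at the next one
        result = -1 if s[attr] < t[attr] else 1
        return result if direction == "asc" else -result
    return 0


def bod_holds(rows, lhs, rhs):
    """Naive O(n^2) check: s <=_lhs t must imply s <=_rhs t for all row pairs."""
    return all(
        compare(s, t, rhs) <= 0
        for s, t in permutations(rows, 2)
        if compare(s, t, lhs) <= 0
    )


# Hypothetical sample table (illustrative values only).
books = [
    {"title": "A", "published": 1995, "age": 30},
    {"title": "B", "published": 2010, "age": 15},
    {"title": "C", "published": 2010, "age": 15},
    {"title": "D", "published": 2021, "age": 4},
]

# Publication date ascending orders age descending: holds.
print(bod_holds(books, [("published", "asc")], [("age", "desc")]))
# Publication date ascending does not order age ascending: violated.
print(bod_holds(books, [("published", "asc")], [("age", "asc")]))
```

The pairwise check is quadratic in the number of rows and covers only one candidate, whereas the number of candidates grows exponentially with the number of attributes, which is the cost that pruning and distribution are meant to address.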
Similar resources
Efficient discovery of functional dependencies with degrees of satisfaction
Functional dependency (FD) is an important type of semantic knowledge reflecting integrity constraints in databases, and has nowadays attracted an increasing amount of research attention in data mining. Traditionally, FD is defined in the light of precise or complete data, and can hardly tolerate partial truth due to imprecise or incomplete data (such as noises, nulls, etc.) that may often exis...
Efficient Discovery of Functional and Approximate Dependencies Using Partitions
Discovery of functional dependencies from relations has been identified as an important database analysis technique. In this paper, we present a new approach for finding functional dependencies from large databases, based on partitioning the set of rows with respect to their attribute values. The use of partitions makes the discovery of approximate functional dependencies easy and efficient, an...
Efficient Discovery of Functional Dependencies and Armstrong Relations
In this paper, we propose a new efficient algorithm called Dep-Miner for discovering minimal non-trivial functional dependencies from large databases. Based on theoretical foundations, our approach combines the discovery of functional dependencies along with the construction of real-world Armstrong relations (without additional execution time). These relations are small Armstrong relations taki...
Efficient discovery of similarity constraints for matching dependencies
The concept of matching dependencies (MDs) has recently been proposed for specifying matching rules for object identification. Similar to the functional dependencies (with conditions), MDs can also be applied to various data quality applications such as detecting ...
Discovery of Paradigm Dependencies
Missing and incorrect values often cause serious consequences. To deal with these data quality problems, a commonly employed class of tools is dependency rules, such as Functional Dependencies (FDs), Conditional Functional Dependencies (CFDs), and Edition Rules (ERs). The stronger the expressive power of a dependency, the better the quality of the data that can be obtained. To the best of our knowledge,...
Journal
Journal title: The VLDB Journal
Year: 2021
ISSN: 0949-877X, 1066-8888
DOI: https://doi.org/10.1007/s00778-021-00683-4